Error Correcting Romaji-kana Conversion for Japanese Language Education
Abstract
We present an approach to help editors of Japanese on a language learning SNS correct learners’ sentences written in Roman characters by converting them into kana. Our system detects foreign words and converts only Japanese words, even if they contain spelling errors. Experimental results show that our system achieves about 10 points higher conversion accuracy than a traditional input method (IM). Error analysis reveals some tendencies in the errors that are specific to language learners.
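To make the conversion task concrete, here is a minimal sketch of romaji-to-kana conversion that leaves unparseable words untouched. It is not the system described in the abstract: the table `ROMAJI_TO_KANA` and the functions `word_to_kana` and `convert_sentence` are hypothetical names introduced for illustration, the table covers only a subset of basic syllables, and "cannot be parsed as romaji" is used as a crude stand-in for foreign-word detection. It does not attempt spelling-error correction.

```python
# Illustrative sketch only; names and table are assumptions, not the paper's method.
# Partial Hepburn-style table (basic syllables plus a few voiced ones).
ROMAJI_TO_KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "shi": "し", "su": "す", "se": "せ", "so": "そ",
    "ta": "た", "chi": "ち", "tsu": "つ", "te": "て", "to": "と",
    "na": "な", "ni": "に", "nu": "ぬ", "ne": "ね", "no": "の",
    "ha": "は", "hi": "ひ", "fu": "ふ", "he": "へ", "ho": "ほ",
    "ma": "ま", "mi": "み", "mu": "む", "me": "め", "mo": "も",
    "ya": "や", "yu": "ゆ", "yo": "よ",
    "ra": "ら", "ri": "り", "ru": "る", "re": "れ", "ro": "ろ",
    "wa": "わ", "wo": "を", "n": "ん",
    "da": "だ", "de": "で", "do": "ど",
}
MAX_KEY_LEN = max(len(key) for key in ROMAJI_TO_KANA)


def word_to_kana(word):
    """Greedy longest-match conversion; returns None if any part fails to parse."""
    lower = word.lower()
    pieces = []
    i = 0
    while i < len(lower):
        # Try the longest candidate chunk first, then shorter ones.
        for size in range(min(MAX_KEY_LEN, len(lower) - i), 0, -1):
            chunk = lower[i:i + size]
            if chunk in ROMAJI_TO_KANA:
                pieces.append(ROMAJI_TO_KANA[chunk])
                i += size
                break
        else:
            # No table entry matches at this position: treat the word as non-Japanese.
            return None
    return "".join(pieces)


def convert_sentence(sentence):
    """Convert each whitespace-separated word, keeping unparseable words as written."""
    return " ".join(word_to_kana(word) or word for word in sentence.split())


print(convert_sentence("Smith san no hon desu"))  # Smith さん の ほん です
```

A rule of this kind also converts foreign words that happen to parse as romaji (for example English "no" or "to") and rejects any Japanese word with a typo, which is the gap that foreign-word detection and error-tolerant conversion are meant to address.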
Similar resources
The functional unit of Japanese word naming: evidence from masked priming.
Theories of language production generally describe the segment as the basic unit in phonological encoding (e.g., Dell, 1988; Levelt, Roelofs, & Meyer, 1999). However, there is also evidence that such a unit might be language specific. Chen, Chen, and Dell (2002), for instance, found no effect of single segments when using a preparation paradigm. To shed more light on the functional unit of phon...
Large Scale Collocation Data and Their Application to Japanese Word Processor Technology
Word processors or computers used in Japan employ Japanese input method through keyboard stroke combined with Kana (phonetic) character to Kanji (ideographic, Chinese) character conversion technology. The key factor of Kana-to-Kanji conversion technology is how to raise the accuracy of the conversion through the homophone processing, since we have so many homophonic Kanjis. In this paper, we re...
Unsupervised Learning of Dependency Structure for Language Modeling
This paper presents a dependency language model (DLM) that captures linguistic constraints via a dependency structure, i.e., a set of probabilistic dependencies that express the relations between headwords of each phrase in a sentence by an acyclic, planar, undirected graph. Our contributions are three-fold. First, we incorporate the dependency structure into an n-gram language model to capture...
Kana-Kanji Conversion System with Input Support Based on Prediction
1 Introduction. TOSHIBA developed the world's first Japanese word processor in 1978. Unlike languages based on an alphabet, Japanese uses thousands of kanji characters of varying complexity. Hence, to arrange all of the kanji characters on a keyboard is difficult. On the other hand, kana characters, which are phonetic scripts of Japanese, have 83 variations; these can be arranged o...
Exploiting Headword Dependency and Predictive Clustering for Language Modeling
This paper presents several practical ways of incorporating linguistic structure into language models. A headword detector is first applied to detect the headword of each phrase in a sentence. A permuted headword trigram model (PHTM) is then generated from the annotated corpus. Finally, PHTM is extended to a cluster PHTM (C-PHTM) by defining clusters for similar words in the corpus. We evaluate...
Publication date: 2011